Use Android llm benchmark runner #5094

huydhn · 2024-09-04T22:14:29Z

The current solution is to:

1. Run the benchmark activity.
2. Poll for the result for up to 10 minutes, since step 1 is async, as @kirklandsign points out. The threshold looks arbitrary to me, but it works, so I think we can keep it for now until a better solution comes along.
3. Store the JSON result as a local artifact so that we can download it later. I will have a follow-up PR to propose its format. I need to cat the result here instead of pulling the file from the device, as the latter ends up with a permission error.

Testing

https://github.com/pytorch/executorch/actions/runs/10731052861/job/29761525913

Download the artifacts from AWS and confirm that the benchmark_results.json file is there, together with instrument.log:

{"generateEnd":1725590151329,"generateStart":1725590150258,"loadEnd":1725590150258,"loadStart":1725590149993,"tokens":"tokens/second: 115.530304"}
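To make the sample result easier to read: the four timestamps look like epoch milliseconds (an assumption based on their magnitude; the format is not specified in this PR), so load and generation durations can be derived by subtraction. The sketch below uses `json_num` and `duration_ms`, which are hypothetical helpers written for this illustration and are not part of the PR.

```shell
#!/usr/bin/env bash
# Sample result copied verbatim from the PR description.
sample='{"generateEnd":1725590151329,"generateStart":1725590150258,"loadEnd":1725590150258,"loadStart":1725590149993,"tokens":"tokens/second: 115.530304"}'

# json_num KEY: hypothetical helper that prints the integer value of KEY
# from flat, single-line JSON on stdin. This is not a general JSON parser.
json_num() {
  sed -n "s/.*\"$1\":\([0-9][0-9]*\).*/\1/p"
}

# duration_ms START_KEY END_KEY JSON: prints END minus START.
# Assumes both values are epoch timestamps in milliseconds.
duration_ms() {
  local start end
  start="$(printf '%s\n' "$3" | json_num "$1")"
  end="$(printf '%s\n' "$3" | json_num "$2")"
  echo $((end - start))
}

echo "load: $(duration_ms loadStart loadEnd "$sample") ms"
echo "generate: $(duration_ms generateStart generateEnd "$sample") ms"
```

For the sample above this prints a load time of 265 ms and a generation time of 1071 ms. A real JSON parser would be more robust once the result format is settled in the follow-up PR.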

pytorch-bot · 2024-09-04T22:14:31Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5094

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 0f45f39 with merge base ee752f0:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kirklandsign · 2024-09-04T23:41:59Z

examples/demo-apps/android/LlamaDemo/android-llm-device-farm-test-spec.yml

+      - |
+        adb -s $DEVICEFARM_DEVICE_UDID shell am start -n com.example.executorchllamademo/.Benchmarking \
+        --es "model_dir" "/data/local/tmp/llama" \
+        --es "tokenizer_path" "/data/local/tmp/llama/tokenizer.bin"


Thank you! Exactly what I want to do :D

Need to find a way to wait for the result file to appear. shell am start is async.

Oh, got it. TIL. I'm still working on this to make the command work, so stay tuned :)

Does this kind of stuff work? 🤔
adb shell while [ ! -f /data/local/tmp/result.txt ]; do sleep 1; done

adb shell doesn't like the way I write the bash script, so I need to look for a workaround by cat-ing the results. It works nonetheless, so I guess we are good :)
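One possible cause of the trouble with the unquoted loop (an assumption; the thread does not say which error was hit) is that the local shell parses `;`, `do`, and `done` itself before adb ever sees them. Passing the whole loop as a single quoted string avoids that. In the sketch below, `run_remote` is a hypothetical stand-in that uses `sh -c` in place of `adb shell`, so the example runs without a device.

```shell
#!/usr/bin/env bash
# run_remote SCRIPT: hypothetical stand-in for `adb shell "SCRIPT"`.
# It hands the script to a separate shell as ONE argument, so the calling
# shell never tries to parse the loop keywords itself.
run_remote() {
  sh -c "$1"
}

# Create the marker up front so the loop exits immediately in this demo.
marker="$(mktemp)"

# The loop travels as a single double-quoted string; $marker is expanded
# locally, and the inner single quotes protect the path in the child shell.
run_remote "while [ ! -f '$marker' ]; do sleep 1; done; echo found"

rm -f "$marker"
```

With a real device the same string would be passed to adb shell instead of `sh -c`; whether that also works under run-as was not verified here.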

kirklandsign · 2024-09-05T03:46:37Z

Please format 🥲 Download the binary from https://github.com/google/google-java-format/releases and run... Sorry, no built-in tool right now.

kirklandsign · 2024-09-05T23:04:10Z

examples/demo-apps/android/LlamaDemo/android-llm-device-farm-test-spec.yml

+        # TODO (huydhn): Polling like this looks brittle, figure out if there is a better way to wait
+        # for the benchmark results
+        adb -s $DEVICEFARM_DEVICE_UDID shell run-as com.example.executorchllamademo \
+        while ! test -f files/benchmark_results.json; do echo "Waiting for benchmark results..."; sleep 30; done


Can we have a maximum timeout? Or just rely on GH to time out?

Yup, we can have a maximum timeout, as the GH Actions timeout is 1 hour, which is a bit too long.
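A bounded wait of the kind discussed here could look roughly like the sketch below. It is an illustration only, not the code that landed in this PR: `wait_until` is a hypothetical helper, and the 600 second and 30 second values in the commented example simply mirror the 10 minute limit and 30 second sleep mentioned in this thread.

```shell
#!/usr/bin/env bash
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND...
# Hypothetical helper: re-runs COMMAND until it succeeds, sleeping
# INTERVAL_SECONDS between attempts. Returns 0 on success, or 1 once at
# least TIMEOUT_SECONDS of sleeping has accumulated without success.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${waited}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}

# Local demo: the check succeeds on the first attempt.
wait_until 5 1 true && echo "ready"

# Hypothetical device usage (not executed here). It assumes adb shell
# passes the remote exit status back to the caller:
#   wait_until 600 30 adb -s "$DEVICEFARM_DEVICE_UDID" shell \
#     run-as com.example.executorchllamademo test -f files/benchmark_results.json
```

Failing fast with a non-zero status lets the test spec stop well before the 1 hour job timeout, at the cost of still polling rather than being notified when the benchmark finishes.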

kirklandsign · 2024-09-05T23:05:03Z

examples/demo-apps/android/LlamaDemo/android-llm-device-farm-test-spec.yml

-      - adb -s $DEVICEFARM_DEVICE_UDID shell "ls -la /data/local/tmp/llama/"
+      - echo "Wait for the results"
+      - |
+        # TODO (huydhn): Polling like this looks brittle, figure out if there is a better way to wait


Curious what could be brittle with this approach?

I just feel that there might be a better approach out there, so I put a TODO here to remind myself for now. I don't like polling for results in general; it feels like a waste of requests :)

facebook-github-bot · 2024-09-06T01:52:54Z

@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Use Android llm benchmark runner

976cffd

facebook-github-bot added the CLA Signed label Sep 4, 2024

huydhn added 2 commits September 4, 2024 15:47

Silly mistake

abe445b

Another copy/paste mistake

7fbf4d5

kirklandsign reviewed Sep 4, 2024


kirklandsign mentioned this pull request Sep 4, 2024

Update script to build and upload MiniBench artifacts #5017

Merged

huydhn added 2 commits September 4, 2024 17:08

I should have built this locally

a43abc2

Missing .get()

03119c2

huydhn added 6 commits September 4, 2024 20:53

Wrong activity name

b068276

Use LlmBenchmarkRunner

25dd5da

Merge branch 'main' into use-llm-benchmark-runner

b765c08

Use s22

f857d95

Use polling to wait for the benchmark results (for now)

7c45d62

Missing change

94c1231

kirklandsign reviewed Sep 5, 2024


huydhn added 3 commits September 5, 2024 18:33

Echo the results back

1b60eb1

Remove unused var

1b3d73a

Fix typo

0f45f39

huydhn marked this pull request as ready for review September 6, 2024 01:52

huydhn requested a review from kirklandsign September 6, 2024 01:52

kirklandsign approved these changes Sep 6, 2024


facebook-github-bot merged commit 40720f0 into main Sep 6, 2024
49 checks passed

facebook-github-bot deleted the use-llm-benchmark-runner branch September 6, 2024 04:12
