Testing Android and iOS apps on OSS CI using Nova reusable mobile workflow
With the advent of new tools like ExecuTorch, it's now possible to run LLM inference locally on mobile devices using different models such as llama2. While it isn't hard to experiment with this new capability, test it out on your own devices, and see some results, it takes more effort to automate this process and make it a part of the CI on various PyTorch-family repositories. To solve this challenge, the PyTorch Dev Infra team is launching a new Nova reusable mobile workflow to do the heavy lifting for you when it comes to testing your mobile apps.
With this new reusable workflow, developers can now:
- Utilize our mobile infrastructure built on top of AWS Device Farm. It offers a wide variety of popular Android and iOS devices from phones to tablets.
- Write and run tests remotely on those devices like how you run them locally with your connected phones.
- Go beyond the emulator to stress test and benchmark your local LLM inference solutions on actual devices. This helps accurately answer the questions of how many tokens the solution can process per second and how much memory and power it needs.
- Debug hard-to-reproduce issues on devices that you don't have.
- Gather the results and share them with others via the familiar GitHub CI UX.
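To make the benchmarking point concrete, tokens per second is simply the number of generated tokens divided by the wall-clock generation time. Here is a minimal sketch of that measurement; the `generate` callable is a hypothetical stand-in for your actual inference call, not part of any real API:

```python
import time

def measure_tps(generate, prompt: str) -> float:
    """Tokens per second = number of generated tokens / wall-clock time.
    `generate` is a hypothetical stand-in for your inference call."""
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed

# Toy generator used purely for illustration
tps = measure_tps(lambda prompt: ["token"] * 100, "hello")
```

Running the same measurement on an emulator and on a physical device is exactly where the numbers tend to diverge, which is why testing on real hardware matters.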
Let's say you are integrating a new ExecuTorch backend which improves llama2 inference performance. You have already run some prompts to confirm that the tokens per second (TPS) is higher than what's reported in https://github.com/pytorch/executorch/tree/main/examples/models/llama2#performance. The result looks good on your phones, so the next step is to confirm the value on CI. To do that, you will need a few things:
- Decide on a group of devices to run the test on. Taking Android as an example, you might want to run it on the recent Samsung Galaxy S2x. Such a group of devices has already been created in our infra under the ARN `arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/e59f866a-30aa-4aa1-87b7-4510e5820dfa`.
- Build the app that you want to test. It would be in the `.apk` format for Android and the `.ipa` format for iOS.
- Prepare the test to run. We support two types of tests at the moment:
  - Instrumented tests on Android: https://developer.android.com/training/testing/instrumented-tests
  - XCTest on iOS: https://developer.apple.com/documentation/xctest
- Prepare an optional zip archive of any data files you want to copy to the remote devices. This usually contains the exported models themselves.
  - On Android, the archive will be extracted to the `/sdcard/` directory.
  - On iOS, the files will be placed in the application sandbox.
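The optional data archive is just a plain zip file. As a quick sketch of assembling one with Python's `zipfile` module (the model and tokenizer file names here are the hypothetical ones from the Llama example, and the placeholder files only stand in for artifacts you would already have):

```python
import zipfile
from pathlib import Path

# Placeholder files standing in for the real exported model and tokenizer
# (hypothetical names from the Llama example); in practice these already exist.
for name in ("xnnpack_llama2.pte", "tokenizer.bin"):
    Path(name).touch()

# Device Farm extracts the archive contents to /sdcard/ on Android
# and into the application sandbox on iOS.
with zipfile.ZipFile("extra-data.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    for name in ("xnnpack_llama2.pte", "tokenizer.bin"):
        zf.write(name)
```

Any tool that produces a standard zip works equally well; the only requirement is that the archive contains the files your test expects to find on the device.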
After having these items ready, the next step is to take a minute to look at the test specification, which codifies how the test is run. You could probably just use the default test spec that we provide, but knowing the steps will come in handy if you need to customize it. Here are some examples:
- The Android test spec for the ExecuTorch Llama app can be found at https://ossci-assets.s3.amazonaws.com/android-llama2-device-farm-test-spec.yml. It prepares the required folder `/data/local/tmp/llama/` and copies the exported model `xnnpack_llama2.pte` together with the tokenizer `tokenizer.bin` there before running the test. `$DEVICEFARM_DEVICE_UDID` is set by AWS Device Farm to the target device, and the output will be available in `$DEVICEFARM_LOG_DIR/instrument.log`.
```yaml
...
test:
  commands:
    # By default, the following ADB command is used by Device Farm to run your Instrumentation test.
    # Please refer to Android's documentation for more options on running instrumentation tests with adb:
    # https://developer.android.com/studio/test/command-line#run-tests-with-adb
    - echo "Starting the Instrumentation test"
    - |
      adb -s $DEVICEFARM_DEVICE_UDID shell "am instrument -r -w --no-window-animation \
        $DEVICEFARM_TEST_PACKAGE_NAME/$DEVICEFARM_TEST_PACKAGE_RUNNER 2>&1 || echo \": -1\"" |
      tee $DEVICEFARM_LOG_DIR/instrument.log
...
```
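Note the `|| echo ": -1"` fallback in the adb command: it appends a bare `: -1` line to `instrument.log` when `am instrument` exits with a nonzero status, giving a later step a sentinel to distinguish a failed run from a successful one. A minimal sketch of such a check (the log excerpts below are made up for illustration, and how the real workflow consumes the log is an assumption here):

```python
def run_failed(log_text: str) -> bool:
    # The `|| echo ": -1"` fallback in the test spec writes a bare ': -1'
    # line only when `am instrument` exits with a nonzero status; the
    # INSTRUMENTATION_* lines are the normal output of `am instrument -r`.
    return any(line.strip() == ": -1" for line in log_text.splitlines())

# Hypothetical log excerpts for illustration
ok_log = "INSTRUMENTATION_RESULT: stream=...\nINSTRUMENTATION_CODE: -1\n"
bad_log = "error: no devices/emulators found\n: -1\n"
```

The exact-match comparison matters: a successful instrumentation run still ends with an `INSTRUMENTATION_CODE: -1` line, which must not be confused with the bare sentinel.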
- The generic iOS test spec used by the ExecuTorch iOS demo app, available at https://ossci-assets.s3.amazonaws.com/default-ios-device-farm-appium-test-spec.yml, just invokes `xcodebuild test-without-building` on the target device.
```yaml
test:
  commands:
    - xcodebuild test-without-building -destination id=$DEVICEFARM_DEVICE_UDID -xctestrun $DEVICEFARM_TEST_PACKAGE_PATH/*.xctestrun -derivedDataPath $DEVICEFARM_LOG_DIR
```
If you have a custom test spec, you'll need to upload it somewhere the workflow can download it from.
Let's bring everything together and go through an actual example of https://github.com/pytorch/executorch/blob/main/.github/workflows/android.yml.
```yaml
name: Android

on:
  ...

jobs:
  # Build all the demo apps
  test-demo-android:
    name: test-demo-android
    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
    strategy:
      matrix:
        include:
          - build-tool: buck2
    with:
      runner: linux.12xlarge
      docker-image: executorch-ubuntu-22.04-clang12-android
      submodules: 'true'
      ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
      timeout: 90
      # The apps are built using the Nova reusable GH action, so we set the upload-artifact parameter here to make them available as artifacts on GitHub
      upload-artifact: android-apps
      script: |
        set -eux

        # ... Building the apps ...

        # In the Nova workflow, all the files under the artifacts-to-be-uploaded folder will be uploaded
        mkdir -p artifacts-to-be-uploaded
        # Copy the app and its test suite
        cp examples/demo-apps/android/LlamaDemo/app/build/outputs/apk/debug/*.apk artifacts-to-be-uploaded/
        cp examples/demo-apps/android/LlamaDemo/app/build/outputs/apk/androidTest/debug/*.apk artifacts-to-be-uploaded/
        # Also copy the shared libraries
        cp cmake-out-android/lib/*.a artifacts-to-be-uploaded/

  # Upload the app and its test suite to S3 so that they can be downloaded by the test job
  upload-artifacts:
    needs: test-demo-android
    runs-on: linux.2xlarge
    steps:
      - name: Download the artifacts
        uses: actions/download-artifact@v3
        with:
          # The name here needs to match the name of the upload-artifact parameter
          name: android-apps
          path: ${{ runner.temp }}/artifacts/
      - name: Verify the artifacts
        shell: bash
        working-directory: ${{ runner.temp }}/artifacts/
        run: |
          ls -lah ./
      - name: Upload the artifacts to S3
        uses: seemethere/upload-artifact-s3@v5
        with:
          s3-bucket: gha-artifacts
          s3-prefix: |
            ${{ github.repository }}/${{ github.run_id }}/artifact
          retention-days: 14
          if-no-files-found: ignore
          path: ${{ runner.temp }}/artifacts/

  # Run the test on remote Android devices
  test-llama-app:
    needs: upload-artifacts
    permissions:
      id-token: write
      contents: read
    uses: pytorch/test-infra/.github/workflows/mobile_job.yml@main
    with:
      device-type: android
      runner: ubuntu-latest
      test-infra-ref: ''
      # This is the ARN of the ExecuTorch project on AWS
      project-arn: arn:aws:devicefarm:us-west-2:308535385114:project:02a2cf0f-6d9b-45ee-ba1a-a086587469e6
      # This is the custom Android device pool that only includes Samsung Galaxy S2x
      device-pool-arn: arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/e59f866a-30aa-4aa1-87b7-4510e5820dfa
      # Uploaded to S3 by the previous job; the name of the app comes from the project itself
      android-app-archive: https://gha-artifacts.s3.amazonaws.com/${{ github.repository }}/${{ github.run_id }}/artifact/app-debug.apk
      android-test-archive: https://gha-artifacts.s3.amazonaws.com/${{ github.repository }}/${{ github.run_id }}/artifact/app-debug-androidTest.apk
      # The test spec can be downloaded from https://ossci-assets.s3.amazonaws.com/android-llama2-device-farm-test-spec.yml. A link to download the spec also works here.
      test-spec: arn:aws:devicefarm:us-west-2:308535385114:upload:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/abd86868-fa63-467e-a5c7-218194665a77
      # The exported llama2 model and its tokenizer can be downloaded from https://ossci-assets.s3.amazonaws.com/executorch-android-llama2-7b.zip. A link to download the archive also works here, but keep in mind that some exported models like llama2 7B are a few GB in size, so it is faster to upload them to AWS beforehand and reuse the existing resource if possible.
      extra-data: arn:aws:devicefarm:us-west-2:308535385114:upload:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/bd15825b-ddab-4e47-9fef-a9c8935778dd
```
`pytorch/test-infra/.github/workflows/mobile_job.yml` is the one doing the heavy lifting here. It can be tweaked with the following parameters:
- `device-type`: either `android` or `ios`.
- `project-arn`: this value is fixed for each project; please reach out to PyTorch Dev Infra if you need to get one. There are two available projects at the moment:
  - `arn:aws:devicefarm:us-west-2:308535385114:project:b531574a-fb82-40ae-b687-8f0b81341ae0` for PyTorch core.
  - `arn:aws:devicefarm:us-west-2:308535385114:project:02a2cf0f-6d9b-45ee-ba1a-a086587469e6` for ExecuTorch.
- `device-pool-arn`: this is the pool of remote devices to run the test on. By default, it selects 5 random popular devices for the test. Please reach out to PyTorch Dev Infra if you need something more specific. Please note that the app itself can limit which devices it can use; for example, having `IPHONEOS_DEPLOYMENT_TARGET` set to 17 will exclude all devices with a lower iOS version.
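That last point can be illustrated with a small sketch of how a deployment target narrows a device pool; the device names and OS versions below are made up for the example and do not reflect any real pool:

```python
# Illustration of how an app's minimum OS version narrows a device pool.
# Device names and OS versions here are made up for the example.
devices = [
    ("iPhone 14", 16),
    ("iPhone 15", 17),
    ("iPad Pro", 17),
]

DEPLOYMENT_TARGET = 17  # the app's minimum supported OS version

eligible = [name for name, os_version in devices
            if os_version >= DEPLOYMENT_TARGET]
```

If your test job reports fewer devices than the pool contains, a too-high deployment target in the app build is one of the first things worth checking.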