
Add llama jobs on Arm64 and reduce llama jobs on MacOS #9251


Merged
merged 1 commit on Mar 14, 2025
23 changes: 16 additions & 7 deletions .github/workflows/pull.yml
@@ -136,6 +136,7 @@ jobs:
PYTHON_EXECUTABLE=python bash .ci/scripts/test_model.sh "${MODEL_NAME}" "${BUILD_TOOL}" "${BACKEND}"

test-llama-runner-linux:
# Test Both linux x86 and linux aarch64
name: test-llama-runner-linux
uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
permissions:
@@ -144,21 +145,29 @@ jobs:
strategy:
matrix:
dtype: [fp32]
mode: [portable, xnnpack+custom, xnnpack+custom+qe,xnnpack+custom+quantize_kv,xnnpack+quantize_kv]
mode: [xnnpack+custom+qe,xnnpack+custom+quantize_kv,xnnpack+quantize_kv]
runner: [linux.2xlarge, linux.arm64.2xlarge]
docker-image: [executorch-ubuntu-22.04-clang12, executorch-ubuntu-22.04-gcc11-aarch64]
include:
- dtype: bf16
Contributor: Wouldn't it be useful to test portable?

Contributor Author: It is part of the trunk job now

mode: portable
- dtype: bf16
mode: custom
runner: linux.2xlarge
docker-image: executorch-ubuntu-22.04-clang12
# Excluding specific runner + docker image combinations that don't make sense:
# - Excluding the ARM64 gcc image on the x86 runner (linux.2xlarge)
# - Excluding the x86 clang image on the ARM64 runner (linux.arm64.2xlarge)
exclude:
- runner: linux.2xlarge
docker-image: executorch-ubuntu-22.04-gcc11-aarch64
- runner: linux.arm64.2xlarge
docker-image: executorch-ubuntu-22.04-clang12
fail-fast: false
with:
runner: linux.2xlarge
docker-image: executorch-ubuntu-22.04-clang12
runner: ${{ matrix.runner }}
docker-image: ${{ matrix.docker-image }}
submodules: 'true'
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
timeout: 900
upload-artifact: android-models
upload-artifact-to-s3: true
script: |
# The generic Linux job chooses to use base env, not the one setup by the image
CONDA_ENV=$(conda env list --json | jq -r ".envs | .[-1]")
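The exclude rules above are what keep the expanded matrix sane: GitHub Actions first builds the full runner × docker-image cross product, drops any combination matching an exclude entry, and only then appends the include entries. A minimal bash sketch (not part of the PR) of how the two exclude rules resolve:

    runners=(linux.2xlarge linux.arm64.2xlarge)
    images=(executorch-ubuntu-22.04-clang12 executorch-ubuntu-22.04-gcc11-aarch64)
    for runner in "${runners[@]}"; do
      for image in "${images[@]}"; do
        # Drop the ARM64 gcc image on the x86 runner and the x86 clang image
        # on the ARM64 runner, mirroring the two exclude entries.
        [[ "${runner}" == "linux.2xlarge" && "${image}" == *aarch64* ]] && continue
        [[ "${runner}" == "linux.arm64.2xlarge" && "${image}" == *clang12* ]] && continue
        echo "${runner} -> ${image}"
      done
    done
    # Prints exactly two runner/image pairings; each one then fans out across
    # the dtype and mode axes, plus the extra bf16 combinations from include.
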
62 changes: 58 additions & 4 deletions .github/workflows/trunk.yml
@@ -283,18 +283,72 @@ jobs:
# Test ANE llama
${CONDA_RUN} sh .ci/scripts/test_ane_static_llama.sh

test-llama-runner-macos:
name: test-llama-runner-mac
uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
test-llama-runner-linux:
# Test Both linux x86 and linux aarch64
name: test-llama-runner-linux
uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
permissions:
id-token: write
contents: read
strategy:
matrix:
dtype: [fp32]
mode: [portable, xnnpack+kv+custom, mps, coreml, xnnpack+custom+quantize_kv]
mode: [portable, xnnpack+custom]
Contributor Author: @jackzhxng here

runner: [linux.2xlarge, linux.arm64.2xlarge]
docker-image: [executorch-ubuntu-22.04-clang12, executorch-ubuntu-22.04-gcc11-aarch64]
include:
- dtype: bf16
mode: portable
runner: linux.2xlarge
docker-image: executorch-ubuntu-22.04-clang12
- dtype: bf16
mode: portable
runner: linux.arm64.2xlarge
docker-image: executorch-ubuntu-22.04-gcc11-aarch64
- dtype: bf16
mode: custom
runner: linux.arm64.2xlarge
docker-image: executorch-ubuntu-22.04-gcc11-aarch64
# Excluding specific runner + docker image combinations that don't make sense:
# - Excluding the ARM64 gcc image on the x86 runner (linux.2xlarge)
# - Excluding the x86 clang image on the ARM64 runner (linux.arm64.2xlarge)
exclude:
- runner: linux.2xlarge
docker-image: executorch-ubuntu-22.04-gcc11-aarch64
- runner: linux.arm64.2xlarge
docker-image: executorch-ubuntu-22.04-clang12
fail-fast: false
with:
runner: ${{ matrix.runner }}
docker-image: ${{ matrix.docker-image }}
submodules: 'true'
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
timeout: 900
script: |
# The generic Linux job chooses to use base env, not the one setup by the image
CONDA_ENV=$(conda env list --json | jq -r ".envs | .[-1]")
conda activate "${CONDA_ENV}"

DTYPE=${{ matrix.dtype }}
BUILD_TOOL="cmake"
MODE=${{ matrix.mode }}
ARTIFACTS_DIR_NAME="artifacts-to-be-uploaded/${DTYPE}-${MODE}"
ARTIFACTS_DIR_NAME="${ARTIFACTS_DIR_NAME/+/-}"

# Setup executorch
PYTHON_EXECUTABLE=python bash .ci/scripts/setup-linux.sh --build-tool "${BUILD_TOOL}"
# Install requirements for export_llama
PYTHON_EXECUTABLE=python bash examples/models/llama/install_requirements.sh
# Test llama2
PYTHON_EXECUTABLE=python bash .ci/scripts/test_llama.sh -model stories110M -build_tool "${BUILD_TOOL}" -dtype "${DTYPE}" -mode "${MODE}" -upload "${ARTIFACTS_DIR_NAME}"
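
Two shell details in the script above are easy to misread, so here is a standalone sketch (not part of the PR; assumes conda and jq are available locally):

    # `conda env list --json` prints e.g. {"envs": ["/opt/conda", "/opt/conda/envs/ci"]};
    # `.envs | .[-1]` selects the last environment path in that list.
    CONDA_ENV=$(conda env list --json | jq -r ".envs | .[-1]")
    echo "would activate: ${CONDA_ENV}"

    # ${VAR/+/-} replaces only the FIRST '+'; ${VAR//+/-} would replace them all.
    s="artifacts-to-be-uploaded/fp32-xnnpack+custom+quantize_kv"
    echo "${s/+/-}"    # artifacts-to-be-uploaded/fp32-xnnpack-custom+quantize_kv
    echo "${s//+/-}"   # artifacts-to-be-uploaded/fp32-xnnpack-custom-quantize_kv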

test-llama-runner-macos:
name: test-llama-runner-mac
uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
strategy:
matrix:
dtype: [fp32]
mode: [mps, coreml, xnnpack+custom+quantize_kv]
Contributor: What about xnnpack+custom+qe?

Contributor Author: Basically, I'm reducing the test coverage on iOS and relying on Arm64 runners.

Contributor Author: And it is tested on the pull job.

fail-fast: false
with:
runner: macos-m1-stable
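
For reference, each matrix cell in these jobs boils down to the same test_llama.sh invocation shown in the linux job's script above. A rough sketch of reproducing one cell locally (assuming an executorch checkout with the CI scripts; the CI run additionally passes -upload, omitted here):

    export PYTHON_EXECUTABLE=python
    # Set up executorch and the export_llama requirements, then run one cell
    # (dtype=bf16, mode=portable) against the stories110M model.
    bash .ci/scripts/setup-linux.sh --build-tool cmake
    bash examples/models/llama/install_requirements.sh
    bash .ci/scripts/test_llama.sh -model stories110M -build_tool cmake -dtype bf16 -mode portable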