OSS CI: Test our endorsed llama path

mergennachin · facebook-github-bot · commit 6a96e090b3c2 · 2024-04-05T16:15:46.000-07:00
Summary:
Currently, OSS CI tests the xnnpack path and the PT2E PTQ path.

However, I realized that no test actually exercises the flow documented in the README file.

Let's test this path too, even if only with the stories model (stories110M). I tested this locally and it works.
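
As a rough illustration of what the xnnpack mode exports after this change, the sketch below mirrors the argument assembly in .ci/scripts/test_llama.sh. The flags are copied from the diff. The values assigned to MODE, PARAMS, DTYPE and EXPORTED_MODEL_NAME are placeholders assumed for the example, not the values CI passes in. The command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the EXPORT_ARGS assembly in .ci/scripts/test_llama.sh.
# The four values below are illustrative placeholders (assumptions),
# not the values the CI job actually uses.
MODE="xnnpack"
PARAMS="params.json"
DTYPE="fp32"
EXPORTED_MODEL_NAME="llama2_example.pte"

EXPORT_ARGS="-c stories110M.pt -p ${PARAMS} -d ${DTYPE} -n ${EXPORTED_MODEL_NAME}"
if [[ "${MODE}" == "xnnpack" ]]; then
  # Flags copied verbatim from the new line in this change.
  EXPORT_ARGS="${EXPORT_ARGS} -kv --use_sdpa_with_kv_cache -X -qmode 8da4w -G 128"
fi

# Print the command instead of running it, so the sketch has no dependencies.
echo "python -m examples.models.llama2.export_llama ${EXPORT_ARGS}"
```

With any other MODE value the base arguments are passed through unchanged, as in the script.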

bypass-github-export-checks
bypass-github-pytorch-ci-checks
bypass-github-executorch-ci-checks

Reviewed By: lucylq

Differential Revision: D55822681

fbshipit-source-id: 794ef7c7c66ffe1ffb3d6971cea5c5dd5c736a8c
diff --git a/.ci/scripts/test_llama.sh b/.ci/scripts/test_llama.sh
@@ -118,7 +118,7 @@ EXPORTED_MODEL_NAME="${EXPORTED_MODEL_NAME}.pte"
 echo "Exporting ${EXPORTED_MODEL_NAME}"
 EXPORT_ARGS="-c stories110M.pt -p ${PARAMS} -d ${DTYPE} -n ${EXPORTED_MODEL_NAME}"
 if [[ "${MODE}" == "xnnpack" ]]; then
-  EXPORT_ARGS="${EXPORT_ARGS} --pt2e_quantize xnnpack_dynamic"
+  EXPORT_ARGS="${EXPORT_ARGS} -kv --use_sdpa_with_kv_cache -X -qmode 8da4w -G 128"
 fi
 $PYTHON_EXECUTABLE -m examples.models.llama2.export_llama ${EXPORT_ARGS}