
Commit a91eb31

Di Xu (SWE) authored and facebook-github-bot committed

Add support to export XNNPACK based static_llama

Summary: Add support to export the XNNPACK-based static_llama. static_llama is the QNN-backend hybrid/prefill+decode model that takes the KV cache as an explicit inference input: https://www.internalfb.com/code/fbsource/fbcode/executorch/examples/qualcomm/oss_scripts/llama2/model/static_llama.py

Differential Revision: D67867190

1 parent 68c0208 · commit a91eb31
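
For readers unfamiliar with the model, here is a minimal sketch of what "KV cache as an inference input" means for a static-shape attention layer: the caller owns the cache tensors and passes them in on every prefill/decode call. All names, shapes, and the functional cache-update style below are illustrative assumptions; the actual definition lives in the static_llama.py path linked above.

import torch
import torch.nn as nn


class StaticAttentionStub(nn.Module):
    # Toy single-head attention layer: the caller owns the KV cache and
    # passes it in as an explicit input, in the spirit of a
    # static_llama-style prefill/decode model. Scaling and causal
    # masking are omitted for brevity.
    def __init__(self, dim: int = 64):
        super().__init__()
        self.wq = nn.Linear(dim, dim, bias=False)
        self.wk = nn.Linear(dim, dim, bias=False)
        self.wv = nn.Linear(dim, dim, bias=False)

    def forward(self, x, k_cache, v_cache, input_pos):
        # x: [batch, new_tokens, dim]; caches: [batch, max_seq_len, dim];
        # input_pos: 1-D tensor of cache slots for the new tokens.
        q, k, v = self.wq(x), self.wk(x), self.wv(x)
        # Functional cache update (no in-place mutation), which keeps an
        # exported graph free of side effects; the updated caches are
        # returned to the caller for the next step.
        k_cache = torch.index_copy(k_cache, 1, input_pos, k)
        v_cache = torch.index_copy(v_cache, 1, input_pos, v)
        attn = torch.softmax(q @ k_cache.transpose(1, 2), dim=-1)
        return attn @ v_cache, k_cache, v_cache


# Decode step: one new token, caches carried across calls by the caller.
layer = StaticAttentionStub()
k0 = torch.zeros(1, 128, 64)
v0 = torch.zeros(1, 128, 64)
out, k1, v1 = layer(torch.randn(1, 1, 64), k0, v0, torch.tensor([0]))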

File tree

1 file changed: +2 −1 lines changed

examples/models/llama/export_llama_lib.py

Lines changed: 2 additions & 1 deletion
@@ -79,7 +79,7 @@
 verbosity_setting = None


-EXECUTORCH_DEFINED_MODELS = ["stories110m", "llama2", "llama3", "llama3_1", "llama3_2"]
+EXECUTORCH_DEFINED_MODELS = ["stories110m", "llama2", "llama3", "llama3_1", "llama3_2", "static_llama"]
 TORCHTUNE_DEFINED_MODELS = ["llama3_2_vision"]


@@ -649,6 +649,7 @@ def _validate_args(args):
     )


+# TODO: export static_llama via XNNPACK
 def _export_llama(args) -> LLMEdgeManager:  # noqa: C901
     _validate_args(args)
     pt2e_quant_params, quantizers, quant_dtype = get_quantizer_and_quant_params(args)
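
As a usage sketch: registry lists like EXECUTORCH_DEFINED_MODELS typically feed the export script's CLI model selection, so the one-line addition is what makes "static_llama" a selectable target. The parser wiring below is an assumption for illustration, not the exact code in export_llama_lib.py.

import argparse

EXECUTORCH_DEFINED_MODELS = [
    "stories110m", "llama2", "llama3", "llama3_1", "llama3_2", "static_llama"
]
TORCHTUNE_DEFINED_MODELS = ["llama3_2_vision"]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical wiring: restricting --model choices to the registry
    # lists means adding "static_llama" above is all it takes to make
    # the model selectable from the export entry point.
    parser = argparse.ArgumentParser(description="Export a llama model")
    parser.add_argument(
        "--model",
        choices=EXECUTORCH_DEFINED_MODELS + TORCHTUNE_DEFINED_MODELS,
        required=True,
        help="Which registered model definition to export.",
    )
    return parser


args = build_parser().parse_args(["--model", "static_llama"])
print(args.model)  # -> static_llama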
